Overview

Dataset statistics

Number of variables: 12
Number of observations: 56
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 KiB
Average record size in memory: 98.3 B

Variable types

Numeric: 12

Warnings

Train Number is highly correlated with Average passengers per day non peak season and 6 other fields (High correlation)
Average passengers per day non peak season is highly correlated with Train Number and 10 other fields (High correlation)
Average Kms Per Day is highly correlated with Average passengers per day non peak season and 6 other fields (High correlation)
Yearly Passenger In Million is highly correlated with Train Number and 10 other fields (High correlation)
Passenger Kilometers is highly correlated with Train Number and 10 other fields (High correlation)
Fuel Consumption in Litres is highly correlated with Average passengers per day non peak season and 7 other fields (High correlation)
Electricity Consumption in Units is highly correlated with Average passengers per day non peak season and 8 other fields (High correlation)
Average Lead Distance is highly correlated with Train Number and 6 other fields (High correlation)
Average Time Delay in Minutes Yearly is highly correlated with Train Number and 8 other fields (High correlation)
Average Lead Time in Mins Yearly is highly correlated with Train Number and 10 other fields (High correlation)
Earnings in Crs is highly correlated with Average passengers per day non peak season and 6 other fields (High correlation)
Average rate per passenger km in paise is highly correlated with Train Number and 7 other fields (High correlation)
Train Number is uniformly distributed (Uniform)
Train Number has unique values (Unique)
Yearly Passenger In Million has unique values (Unique)
Passenger Kilometers has unique values (Unique)
Fuel Consumption in Litres has unique values (Unique)
Electricity Consumption in Units has unique values (Unique)
Average Lead Time in Mins Yearly has unique values (Unique)
Earnings in Crs has unique values (Unique)

Reproduction

Analysis started: 2021-03-11 12:02:21.585117
Analysis finished: 2021-03-11 12:02:44.144584
Duration: 22.56 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Train Number
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1258.5
Minimum: 1231
Maximum: 1286
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 1231
5th percentile: 1233.75
Q1: 1244.75
Median: 1258.5
Q3: 1272.25
95th percentile: 1283.25
Maximum: 1286
Range: 55
Interquartile range (IQR): 27.5

Descriptive statistics

Standard deviation: 16.30950643
Coefficient of variation (CV): 0.01295948068
Kurtosis: -1.2
Mean: 1258.5
Median Absolute Deviation (MAD): 14
Skewness: 0
Sum: 70476
Variance: 266
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1280               1      1.8%
1281               1      1.8%
1254               1      1.8%
1255               1      1.8%
1256               1      1.8%
1257               1      1.8%
1258               1      1.8%
1259               1      1.8%
1260               1      1.8%
1261               1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
1231               1      1.8%
1232               1      1.8%
1233               1      1.8%
1234               1      1.8%
1235               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
1286               1      1.8%
1285               1      1.8%
1284               1      1.8%
1283               1      1.8%
1282               1      1.8%
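
As a cross-check, the Train Number statistics above can be reproduced with Python's standard library. This is a minimal sketch, assuming the column holds every integer from 1231 to 1286 exactly once (an inference from the reported minimum, maximum, and 56 distinct values, not something the report states). The variable names are illustrative, and the "inclusive" quantile method is an assumption chosen because it matches the reported quartiles.

```python
import statistics

# Assumption: Train Number takes each integer from 1231 to 1286 exactly once.
train_numbers = list(range(1231, 1287))

mean = statistics.mean(train_numbers)            # report lists 1258.5
variance = statistics.variance(train_numbers)    # sample variance (n - 1); report lists 266
std_dev = statistics.stdev(train_numbers)        # report lists 16.30950643
cv = std_dev / mean                              # coefficient of variation; report lists 0.01295948068

# Median absolute deviation: the median of the absolute distances to the median.
median = statistics.median(train_numbers)
mad = statistics.median(abs(x - median) for x in train_numbers)  # report lists 14

# Quartiles with linear interpolation between data points.
q1, _, q3 = statistics.quantiles(train_numbers, n=4, method="inclusive")
iqr = q3 - q1                                    # report lists 27.5
```

The variance here uses the n - 1 denominator, which is what reproduces the reported value of 266 under the stated assumption.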

Average passengers per day non peak season
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2178.339286
Minimum: 412
Maximum: 4552
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 412
5th percentile: 743
Q1: 1348.5
Median: 2046.5
Q3: 2793.5
95th percentile: 4140
Maximum: 4552
Range: 4140
Interquartile range (IQR): 1445

Descriptive statistics

Standard deviation: 1048.958675
Coefficient of variation (CV): 0.4815405394
Kurtosis: -0.3423830251
Mean: 2178.339286
Median Absolute Deviation (MAD): 748
Skewness: 0.4873672491
Sum: 121987
Variance: 1100314.301
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1884               2      3.6%
3329               1      1.8%
3178               1      1.8%
1106               1      1.8%
2259               1      1.8%
2005               1      1.8%
3802               1      1.8%
1373               1      1.8%
680                1      1.8%
2017               1      1.8%
Other values (45)  45     80.4%

Minimum 5 values
Value              Count  Frequency (%)
412                1      1.8%
499                1      1.8%
680                1      1.8%
764                1      1.8%
808                1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
4552               1      1.8%
4477               1      1.8%
4377               1      1.8%
4061               1      1.8%
3876               1      1.8%

Average Kms Per Day
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1738.214286
Minimum: 776
Maximum: 3944
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 776
5th percentile: 924.5
Q1: 1213.75
Median: 1562
Q3: 1853.5
95th percentile: 3653.75
Maximum: 3944
Range: 3168
Interquartile range (IQR): 639.75

Descriptive statistics

Standard deviation: 786.3194982
Coefficient of variation (CV): 0.4523720146
Kurtosis: 1.74970066
Mean: 1738.214286
Median Absolute Deviation (MAD): 351.5
Skewness: 1.527329883
Sum: 97340
Variance: 618298.3532
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
1637               2      3.6%
3118               1      1.8%
2396               1      1.8%
1602               1      1.8%
3944               1      1.8%
1485               1      1.8%
1606               1      1.8%
1613               1      1.8%
2126               1      1.8%
1743               1      1.8%
Other values (45)  45     80.4%

Minimum 5 values
Value              Count  Frequency (%)
776                1      1.8%
872                1      1.8%
914                1      1.8%
928                1      1.8%
942                1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
3944               1      1.8%
3847               1      1.8%
3845               1      1.8%
3590               1      1.8%
3370               1      1.8%

Yearly Passenger In Million
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3916.553571
Minimum: 1275
Maximum: 8421
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 1275
5th percentile: 1667.5
Q1: 2509.75
Median: 3654
Q3: 4647
95th percentile: 7794.25
Maximum: 8421
Range: 7146
Interquartile range (IQR): 2137.25

Descriptive statistics

Standard deviation: 1810.605905
Coefficient of variation (CV): 0.4622957076
Kurtosis: 0.450576713
Mean: 3916.553571
Median Absolute Deviation (MAD): 1148.5
Skewness: 0.9404270775
Sum: 219327
Variance: 3278293.743
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
2945               1      1.8%
5378               1      1.8%
1992               1      1.8%
6219               1      1.8%
8397               1      1.8%
7246               1      1.8%
1872               1      1.8%
3792               1      1.8%
2257               1      1.8%
4049               1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
1275               1      1.8%
1284               1      1.8%
1594               1      1.8%
1692               1      1.8%
1750               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
8421               1      1.8%
8397               1      1.8%
8224               1      1.8%
7651               1      1.8%
7246               1      1.8%

Passenger Kilometers
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59683.39286
Minimum: 6551
Maximum: 168589
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 6551
5th percentile: 12893.5
Q1: 26009.5
Median: 47134
Q3: 86017.5
95th percentile: 138859.5
Maximum: 168589
Range: 162038
Interquartile range (IQR): 60008

Descriptive statistics

Standard deviation: 41027.75023
Coefficient of variation (CV): 0.6874232222
Kurtosis: -0.1690254382
Mean: 59683.39286
Median Absolute Deviation (MAD): 27842.5
Skewness: 0.8136218146
Sum: 3342270
Variance: 1683276289
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
28037              1      1.8%
39433              1      1.8%
63045              1      1.8%
51912              1      1.8%
38730              1      1.8%
73292              1      1.8%
103759             1      1.8%
22163              1      1.8%
13268              1      1.8%
85066              1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
6551               1      1.8%
8165               1      1.8%
11770              1      1.8%
13268              1      1.8%
13561              1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
168589             1      1.8%
145654             1      1.8%
144057             1      1.8%
137127             1      1.8%
130917             1      1.8%

Fuel Consumption in Litres
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 282103.5893
Minimum: 54235
Maximum: 990153
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 54235
5th percentile: 67936.5
Q1: 100583.5
Median: 201615.5
Q3: 351237.5
95th percentile: 856652
Maximum: 990153
Range: 935918
Interquartile range (IQR): 250654

Descriptive statistics

Standard deviation: 246202.4754
Coefficient of variation (CV): 0.8727378337
Kurtosis: 1.543877476
Mean: 282103.5893
Median Absolute Deviation (MAD): 108472
Skewness: 1.532670644
Sum: 15797801
Variance: 6.061565889 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
345600             1      1.8%
87425              1      1.8%
115899             1      1.8%
59966              1      1.8%
236066             1      1.8%
902465             1      1.8%
772548             1      1.8%
180808             1      1.8%
990153             1      1.8%
424778             1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
54235              1      1.8%
59966              1      1.8%
65895              1      1.8%
68617              1      1.8%
70430              1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
990153             1      1.8%
952449             1      1.8%
902465             1      1.8%
841381             1      1.8%
772548             1      1.8%

Electricity Consumption in Units
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 341786.9821
Minimum: 62400
Maximum: 1158742
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 62400
5th percentile: 80830
Q1: 126022.75
Median: 248574.5
Q3: 437255
95th percentile: 995511.5
Maximum: 1158742
Range: 1096342
Interquartile range (IQR): 311232.25

Descriptive statistics

Standard deviation: 286158.7887
Coefficient of variation (CV): 0.8372430889
Kurtosis: 1.274785716
Mean: 341786.9821
Median Absolute Deviation (MAD): 133257.5
Skewness: 1.43385896
Sum: 19140071
Variance: 8.188685234 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
102145             1      1.8%
490912             1      1.8%
1158742            1      1.8%
300103             1      1.8%
430666             1      1.8%
978508             1      1.8%
269389             1      1.8%
66517              1      1.8%
575702             1      1.8%
222935             1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
62400              1      1.8%
66517              1      1.8%
77665              1      1.8%
81885              1      1.8%
83991              1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
1158742            1      1.8%
1098103            1      1.8%
1046522            1      1.8%
978508             1      1.8%
903465             1      1.8%

Average Lead Distance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 48
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.65714286
Minimum: 15.9
Maximum: 37
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 15.9
5th percentile: 16.55
Q1: 19.175
Median: 24
Q3: 30.85
95th percentile: 33.125
Maximum: 37
Range: 21.1
Interquartile range (IQR): 11.675

Descriptive statistics

Standard deviation: 6.136651653
Coefficient of variation (CV): 0.2488792675
Kurtosis: -1.390149015
Mean: 24.65714286
Median Absolute Deviation (MAD): 5.7
Skewness: 0.177715333
Sum: 1380.8
Variance: 37.65849351
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=48)

Common values
Value              Count  Frequency (%)
29.7               2      3.6%
24                 2      3.6%
33.8               2      3.6%
32.5               2      3.6%
20.4               2      3.6%
31                 2      3.6%
16.4               2      3.6%
20.6               2      3.6%
19.1               1      1.8%
29.5               1      1.8%
Other values (38)  38     67.9%

Minimum 5 values
Value              Count  Frequency (%)
15.9               1      1.8%
16.4               2      3.6%
16.6               1      1.8%
16.8               1      1.8%
16.9               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
37                 1      1.8%
33.8               2      3.6%
32.9               1      1.8%
32.8               1      1.8%
32.7               1      1.8%

Average Time Delay in Minutes Yearly
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 139.5785714
Minimum: 68.8
Maximum: 257.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 68.8
5th percentile: 73.45
Q1: 83.275
Median: 128.95
Q3: 187.55
95th percentile: 234.45
Maximum: 257.5
Range: 188.7
Interquartile range (IQR): 104.275

Descriptive statistics

Standard deviation: 58.94213407
Coefficient of variation (CV): 0.4222864116
Kurtosis: -1.276494855
Mean: 139.5785714
Median Absolute Deviation (MAD): 50.75
Skewness: 0.3895009584
Sum: 7816.4
Variance: 3474.175169
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
74.8               2      3.6%
72.1               1      1.8%
169.3              1      1.8%
74.4               1      1.8%
141.6              1      1.8%
99.8               1      1.8%
178                1      1.8%
208.6              1      1.8%
147.6              1      1.8%
69.9               1      1.8%
Other values (45)  45     80.4%

Minimum 5 values
Value              Count  Frequency (%)
68.8               1      1.8%
69.9               1      1.8%
72.1               1      1.8%
73.9               1      1.8%
74.4               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
257.5              1      1.8%
241.5              1      1.8%
234.6              1      1.8%
234.4              1      1.8%
229.3              1      1.8%

Average Lead Time in Mins Yearly
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.36428571
Minimum: 46.2
Maximum: 138
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 46.2
5th percentile: 47.2
Q1: 50.125
Median: 70.55
Q3: 94.075
95th percentile: 127.375
Maximum: 138
Range: 91.8
Interquartile range (IQR): 43.95

Descriptive statistics

Standard deviation: 27.47692869
Coefficient of variation (CV): 0.3645881923
Kurtosis: -0.6990821697
Mean: 75.36428571
Median Absolute Deviation (MAD): 21.1
Skewness: 0.7086805295
Sum: 4220.4
Variance: 754.9816104
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
46.6               1      1.8%
67                 1      1.8%
71.4               1      1.8%
86                 1      1.8%
47.3               1      1.8%
62.1               1      1.8%
70.1               1      1.8%
68                 1      1.8%
94.6               1      1.8%
118                1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
46.2               1      1.8%
46.6               1      1.8%
46.9               1      1.8%
47.3               1      1.8%
47.4               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
138                1      1.8%
130.4              1      1.8%
127.9              1      1.8%
127.2              1      1.8%
124.7              1      1.8%

Earnings in Crs
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6504.748214
Minimum: 98.2
Maximum: 36532.3
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 98.2
5th percentile: 146.075
Q1: 337.875
Median: 1829.55
Q3: 9787.5
95th percentile: 26340.8
Maximum: 36532.3
Range: 36434.1
Interquartile range (IQR): 9449.625

Descriptive statistics

Standard deviation: 9088.261979
Coefficient of variation (CV): 1.397173523
Kurtosis: 2.133688911
Mean: 6504.748214
Median Absolute Deviation (MAD): 1652.35
Skewness: 1.684436524
Sum: 364265.9
Variance: 82596505.8
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
25705.6            1      1.8%
31322.8            1      1.8%
827.5              1      1.8%
320.1              1      1.8%
15080.8            1      1.8%
1939.7             1      1.8%
278.9              1      1.8%
199.3              1      1.8%
185.2              1      1.8%
98.2               1      1.8%
Other values (46)  46     82.1%

Minimum 5 values
Value              Count  Frequency (%)
98.2               1      1.8%
107.7              1      1.8%
131.6              1      1.8%
150.9              1      1.8%
169.2              1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
36532.3            1      1.8%
31322.8            1      1.8%
28246.4            1      1.8%
25705.6            1      1.8%
23414.4            1      1.8%

Average rate per passenger km in paise
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.65125
Minimum: 1.48
Maximum: 31.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 576.0 B

Quantile statistics

Minimum: 1.48
5th percentile: 1.8175
Q1: 2.565
Median: 7.355
Q3: 22.3675
95th percentile: 26.475
Maximum: 31.5
Range: 30.02
Interquartile range (IQR): 19.8025

Descriptive statistics

Standard deviation: 9.851850599
Coefficient of variation (CV): 0.8455616864
Kurtosis: -1.393659295
Mean: 11.65125
Median Absolute Deviation (MAD): 5.245
Skewness: 0.5208817919
Sum: 652.47
Variance: 97.05896023
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
24.5               2      3.6%
17.09              1      1.8%
24.3               1      1.8%
1.85               1      1.8%
26.3               1      1.8%
2.5                1      1.8%
10.64              1      1.8%
3.97               1      1.8%
2.13               1      1.8%
7.56               1      1.8%
Other values (45)  45     80.4%

Minimum 5 values
Value              Count  Frequency (%)
1.48               1      1.8%
1.71               1      1.8%
1.72               1      1.8%
1.85               1      1.8%
2.01               1      1.8%

Maximum 5 values
Value              Count  Frequency (%)
31.5               1      1.8%
28.5               1      1.8%
27                 1      1.8%
26.3               1      1.8%
26.1               1      1.8%

Interactions

[132 interaction plots not reproduced in this text version.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
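
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs: the function name `pearson_r` is hypothetical, and the inputs are only the first five sample rows of this report, so the result is not the value shown in the report's correlation matrix.

```python
import math

def pearson_r(xs, ys):
    """Sample covariance of xs and ys divided by the product of their sample standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

# First five sample rows: Train Number and
# Average passengers per day non peak season.
train_number = [1231, 1232, 1233, 1234, 1235]
passengers = [412, 499, 680, 764, 808]

r = pearson_r(train_number, passengers)

# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r(train_number, [3 * p + 7 for p in passengers])
```

Comparing `r` with `r_rescaled` illustrates the invariance under changes in location and scale.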

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
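
A minimal sketch of this rank-based calculation follows. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, and giving tied values the average of their ranks is an assumption the text does not spell out. The inputs are the first five sample rows of this report, so the result does not reproduce the report's matrix.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# First five sample rows: Train Number and Average Lead Distance.
train_number = [1231, 1232, 1233, 1234, 1235]
lead_distance = [15.9, 16.4, 17.3, 17.4, 16.8]

rho = spearman_rho(train_number, lead_distance)  # 0.7, up to floating-point rounding
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.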

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
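
The pair-counting rule can be sketched as follows. This follows the description literally (a variant often called tau-a); library implementations may apply a tie correction, so their values can differ when ties are present. The function name `kendall_tau` is hypothetical, and the inputs are the first five sample rows of this report.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)

# First five sample rows: Train Number and Average Lead Distance.
train_number = [1231, 1232, 1233, 1234, 1235]
lead_distance = [15.9, 16.4, 17.3, 17.4, 16.8]

tau = kendall_tau(train_number, lead_distance)  # 8 concordant, 2 discordant of 10 pairs: 0.6
```

Comparing every pair takes quadratic time, which is fine for 56 observations but would be slow on large datasets.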

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index,Train Number,Average passengers per day non peak season,Average Kms Per Day,Yearly Passenger In Million,Passenger Kilometers,Fuel Consumption in Litres,Electricity Consumption in Units,Average Lead Distance,Average Time Delay in Minutes Yearly,Average Lead Time in Mins Yearly,Earnings in Crs,Average rate per passenger km in paise
0,1231,412,872,1284,6551,59966,66517,15.9,68.8,51.8,98.2,1.48
1,1232,499,776,1275,8165,54235,62400,16.4,69.9,48.9,107.7,1.72
2,1233,680,914,1594,11770,65895,77665,17.3,72.1,48.7,131.6,1.71
3,1234,764,928,1692,13268,68617,81885,17.4,73.9,48.4,150.9,1.85
4,1235,808,942,1750,13561,70430,83991,16.8,74.8,48.0,169.2,2.01
5,1236,882,990,1872,14460,74128,88588,16.4,74.9,47.3,185.2,2.09
6,1237,954,1038,1992,15791,77698,93489,16.6,74.8,46.9,199.3,2.13
7,1238,1018,1064,2082,17164,79130,96294,16.9,74.4,46.2,219.2,2.28
8,1239,1081,1111,2192,18469,83676,102145,17.1,75.3,46.6,229.3,2.25
9,1240,1106,1151,2257,19068,88095,107163,17.2,76.5,47.4,252.6,2.36

Last rows

Index,Train Number,Average passengers per day non peak season,Average Kms Per Day,Yearly Passenger In Million,Passenger Kilometers,Fuel Consumption in Litres,Electricity Consumption in Units,Average Lead Distance,Average Time Delay in Minutes Yearly,Average Lead Time in Mins Yearly,Earnings in Crs,Average rate per passenger km in paise
46,1277,3178,2200,5378,103759,471943,575702,32.7,214.5,107.0,14072.5,24.4
47,1278,3329,2396,5725,106419,509195,615614,32.0,212.6,107.5,15080.8,24.5
48,1279,3514,2705,6219,111897,582867,694764,31.8,215.5,111.7,17176.0,24.7
49,1280,3689,2835,6524,119842,650114,769956,32.5,229.3,118.0,19783.3,25.7
50,1281,3802,3118,6920,124836,713196,838032,32.8,228.7,121.1,21866.5,26.1
51,1282,3876,3370,7246,130917,772548,903465,33.8,229.2,124.7,23414.4,25.9
52,1283,4061,3590,7651,137127,841381,978508,33.8,234.4,127.9,25705.6,26.3
53,1284,4377,3847,8224,144057,902465,1046522,32.9,234.6,127.2,28246.4,27.0
54,1285,4477,3944,8421,145654,952449,1098103,32.5,241.5,130.4,31322.8,28.5
55,1286,4552,3845,8397,168589,990153,1158742,37.0,257.5,138.0,36532.3,31.5